ABSTRACT. We introduce fully nonparametric two-sample tests for testing the null hypothesis 
that the samples come from the same distribution if the values are only indirectly given via 
current status censoring. The tests are based on the likelihood ratio principle and allow the 
observation distributions to be different for the two samples, in contrast with earlier proposals 
for this situation. A bootstrap method is given for determining critical values and asymptotic 
theory is developed. A simulation study, using Weibull distributions, is presented to compare 
the power behavior of the tests with the power of other nonparametric tests in this situation. 

1 Introduction 

At the beginning of the vast amount of research on right-censored data, there was much interest 
in two-sample tests for right-censored data, like the Gehan test, log rank test, Efron's test, 
etc. For example, Gehan (1965) considers the problem of testing $F_1 = F_2$ against the 
alternative $F_1 < F_2$, and gives a permutation test for this problem. 

Permutation tests for the two-sample problem with interval censored data have been 
considered in Peto and Peto (1972). Since they rely on the permutation distribution, such tests can 
only be used when the censoring mechanism is the same in both samples. One of the referees of 
this paper asked the interesting question whether permutation tests of this type, considered as 
conditional tests, might be asymptotically independent of the observation distributions in the 
two samples, in analogy with results in Neuhaus (1993) for two-sample tests in the presence 
of right censoring. I do not know the answer to this question (current status censoring is very 
different from right censoring!), but preliminary results indicate that this method gives very 
variable estimates of the critical values for moderate sample sizes and therefore cannot be used 
for these sample sizes. The bootstrap method we propose for computing the critical values does 
not suffer from this drawback; see section 6. 

The maximum likelihood estimator for interval censored data is considered in more detail in 
Peto (1973), where it is suggested that pointwise standard errors for the survival curve can be 
estimated from the inverse of the Fisher information. However, it has long been known 
that this is not correct if we sample from continuous distributions; the pointwise asymptotic 
distribution is not normal, and the asymptotic variance is not given by the inverse of the 
Fisher information, see, e.g., Groeneboom and Wellner (1992) (I owe this observation on Peto 
(1973) to Peter Sasieni). 

Other tests have been considered in, e.g., Andersen and Rønn (1995) and Sun (2006), where 
also references to earlier work by the latter author can be found. They are based on certain 
functionals of the distributions which will be different from zero for some alternatives (mostly 
of the type of "shift alternatives"). Similar tests have been considered in Zhang et al. (2011) 
and Zhang (2006) for panel count data, where pseudo maximum likelihood estimators are used. 
Specialized to our present problem, this leads to tests of the same type as the tests in Andersen 
and Rønn (1995) and Sun (2006). 

We consider here rather different types of tests which are likelihood ratio based tests for 
testing that two samples come from the same distribution, if current status censoring is present. 
A test of this type is considered in Chapter 3 of Kulikov (2003), where the null hypothesis 
of equality of the distribution functions $F_1$ and $F_2$, generating the first and second sample, 
respectively, is tested against Lehmann alternatives of the form 

$$F_2(x) = F_1(x)^{1+\theta}, \qquad \theta \in (-1,\infty)\setminus\{0\}. \tag{1.1}$$

Here we prefer to test the null hypothesis of equality of $F_1$ and $F_2$ just against the more 
general alternative that they are not equal. Note that in testing against the Lehmann 
alternatives (1.1), we have to estimate $F_1$ and $\theta$, whereas in the more general testing problem we have 
to estimate both $F_1$ and $F_2$ nonparametrically. 

We will assume the usual conditions for the current status model with continuous 
distributions, as stated on p. 35 of Groeneboom and Wellner (1992): $(X_1, T_1), \dots, (X_m, T_m)$ and 
$(X_{m+1}, T_{m+1}), \dots, (X_N, T_N)$, $N = m + n$, are two independent samples of random variables 
in $\mathbb{R}_+^2$, where $X_i$ and $T_i$ are independent, with, respectively, continuous distribution functions 
$F_1$ and $G_1$ in the first sample and continuous distribution functions $F_2$ and $G_2$ in the second 
sample. We call the $X_i$ the "hidden" variables and the $T_i$ the observation variables. Note that 
we allow the distribution functions $G_1$ and $G_2$ of the observation variables to be different in 
the two samples. In the current status model, the only observations which are available to us 
are the pairs 

$$(T_i, \Delta_i), \qquad \Delta_i = 1_{\{X_i \le T_i\}},$$

so we do not observe $X_i$ itself, but only its "current status" $\Delta_i$. In this situation, we want to 
test the null hypothesis that the distribution functions of the hidden variables are the same in 
the two samples. 

We first discuss what a simple likelihood ratio test would look like. Under the null hypothesis 
we have to maximize 

$$\sum_{i=1}^{N} \bigl\{\Delta_i \log F(T_i) + (1 - \Delta_i) \log (1 - F(T_i))\bigr\}, \qquad N = m + n,$$

over all distribution functions $F$, and without the restriction of the null hypothesis we have to 
maximize 

$$\sum_{i=1}^{m} \bigl\{\Delta_i \log F_1(T_i) + (1 - \Delta_i) \log (1 - F_1(T_i))\bigr\}
+ \sum_{i=m+1}^{N} \bigl\{\Delta_i \log F_2(T_i) + (1 - \Delta_i) \log (1 - F_2(T_i))\bigr\}$$

over all pairs of distribution functions $(F_1, F_2)$. 

This means that under the null hypothesis the MLE (maximum likelihood estimator) is 
given by the left-continuous slope of the greatest convex minorant of the cusum diagram of the 
points $(0, 0)$ and the points 

$$\Bigl(i, \sum_{j \le i} \Delta_{(j)}\Bigr), \qquad i = 1, \dots, N, \tag{1.2}$$

using a notation introduced in Groeneboom and Wellner (1992). Here $\Delta_{(j)}$ denotes the 
indicator corresponding to the $j$th order statistic $T_{(j)}$. Without the restriction of the null hypothesis 
the MLE of $F_1$ is given by the left-continuous slope of the greatest convex minorant of the 
cusum diagram of the points $(0, 0)$ and the points 

$$\Bigl(i, \sum_{j \le i} \Delta_{(j1)}\Bigr), \qquad i = 1, \dots, m, \tag{1.3}$$

where $\Delta_{(j1)}$ is the indicator corresponding to the $j$th order statistic $T_{(j1)}$ of the first sample. 
Similarly the MLE of $F_2$ is given by the left-continuous slope of the greatest convex minorant of 
the cusum diagram of the points $(0, 0)$ and the points 

$$\Bigl(i, \sum_{j \le i} \Delta_{(j2)}\Bigr), \qquad i = 1, \dots, n, \tag{1.4}$$

where $\Delta_{(j2)}$ is the indicator corresponding to the $j$th order statistic $T_{(j2)}$ of the second sample. 

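Let the MLE of $F_1$ ($= F_2$) under the null hypothesis be given by $\hat F_N$, and let the MLE of 
the pair $(F_1, F_2)$ without the restriction of the null hypothesis be given by $(\hat F_{N1}, \hat F_{N2})$. 
Then the log likelihood ratio test statistic is given by: 

$$\sum_{i=1}^{m} \Bigl\{\Delta_i \log \frac{\hat F_{N1}(T_i)}{\hat F_N(T_i)} + (1 - \Delta_i) \log \frac{1 - \hat F_{N1}(T_i)}{1 - \hat F_N(T_i)}\Bigr\}
+ \sum_{i=m+1}^{N} \Bigl\{\Delta_i \log \frac{\hat F_{N2}(T_i)}{\hat F_N(T_i)} + (1 - \Delta_i) \log \frac{1 - \hat F_{N2}(T_i)}{1 - \hat F_N(T_i)}\Bigr\}, \tag{1.5}$$

where the terms with coefficients $\Delta_i$ and $1 - \Delta_i$ are defined to be zero if $\Delta_i$ and $1 - \Delta_i$ are 
zero, respectively. 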
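
The cusum-diagram characterization of the MLEs and the resulting log likelihood ratio statistic can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names (`current_status_mle`, `log_lik`, `lr_statistic`) are hypothetical, and the pool-adjacent-violators algorithm is used as a standard way of obtaining the slopes of the greatest convex minorant; it is not the C program used for the simulations in this paper.

```python
import numpy as np

def current_status_mle(t, delta):
    """MLE of F at the ordered observation times (pool-adjacent-violators).

    The fitted values are the slopes of the greatest convex minorant of the
    cusum diagram of the points (i, sum_{j<=i} delta_(j)).
    """
    t = np.asarray(t, dtype=float)
    order = np.argsort(t, kind="stable")
    y = np.asarray(delta, dtype=float)[order]
    blocks = []  # each block holds [sum of indicators, number of points]
    for v in y:
        blocks.append([v, 1.0])
        # Pool adjacent blocks while their means violate monotonicity.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]):
            s, w = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += w
    fit = np.concatenate([np.full(int(w), s / w) for s, w in blocks])
    return t[order], fit, y

def log_lik(fit, y):
    """Log likelihood; terms with a zero coefficient are defined to be zero."""
    with np.errstate(divide="ignore", invalid="ignore"):
        a = np.where(y > 0, y * np.log(fit), 0.0)
        b = np.where(y < 1, (1.0 - y) * np.log(1.0 - fit), 0.0)
    return float(np.sum(a + b))

def lr_statistic(t1, d1, t2, d2):
    """Unrestricted maximum of the log likelihood minus the maximum under H0."""
    _, f1, y1 = current_status_mle(t1, d1)
    _, f2, y2 = current_status_mle(t2, d2)
    _, f0, y0 = current_status_mle(np.concatenate([t1, t2]),
                                   np.concatenate([d1, d2]))
    return log_lik(f1, y1) + log_lik(f2, y2) - log_lik(f0, y0)
```

Since the restricted maximum can never exceed the unrestricted one, the value returned by `lr_statistic` is nonnegative up to rounding error, and it is zero when the two samples are identical.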

Although we take this statistic as our inspiration, we first study a statistic somewhat similar 
to this LR statistic, based on maximum smoothed likelihood estimators (MSLEs), introduced 
in Groeneboom et al. (2010). One of the reasons is that the asymptotic analysis of the original 
LR statistic is rather involved; the difficulty in analyzing the limit properties of (1.5) lies in the 
problem of finding a normalization making it an asymptotic pivot under the null hypothesis. 
One also has to deal with the non-standard asymptotics, which derives from the fact that the 
statistic is based on (non-linear) isotonic estimators which satisfy an order restriction. These 
non-standard features also turn up in the limit behavior. Another reason is that the MSLE 
leads to more powerful tests for models commonly used in this type of comparison. This 
will be illustrated by a simulation study for a two-parameter Weibull distribution, also used in 
Andersen and Rønn (1995) in a simulation study to check the power of their proposed test. 

Maximum smoothed likelihood estimators for current status data were studied in Groene- 
boom et al. (2010), where it was shown that, under some regularity conditions, the local limit 
distribution is normal (in contrast with the limit behavior of the original MLE). These es- 
timators are obtained by first smoothing the observation distribution, for example by kernel 
estimators, and next maximizing the smoothed likelihood w.r.t. the distribution of the hidden 
variables. In this way the MSLE inherits smoothness properties of the estimate of the 
observation distribution and converges at a faster rate than the "raw" MLE, which locally converges 
at rate $N^{-1/3}$ under the usual smoothness conditions on the underlying distributions. Further 
results on the MSLE can be found in Groeneboom et al. (2010). 

A picture of the MSLEs and the MLEs for samples of size 250 from two 
different Weibull distributions with densities 

$$\alpha_1 \lambda x^{\alpha_1 - 1} e^{-\lambda x^{\alpha_1}}, \qquad \alpha_2 \lambda x^{\alpha_2 - 1} e^{-\lambda x^{\alpha_2}}, \qquad x > 0,\ \alpha_1 = 0.5,\ \alpha_2 = 2,\ \lambda = 1.6, \tag{1.6}$$

respectively, where $\alpha_1 = 0.5$ holds for the first sample and $\alpha_2 = 2$ for the second sample, is 
shown in Figure 1. 

Figure 1: MSLEs and MLEs on $[a, b]$ for samples of size $m = n = 250$ from the Weibull densities 
(1.6). $G_1$ and $G_2$ are uniform on $[0,2]$, and the interval $[a,b] = [0.1,1.9]$. The left panel gives 
the MSLEs and the right panel the MLEs, where the dashed curves give the estimates 
for the first sample ($\alpha_1 = 0.5$), the dotted curves the estimates for the second sample ($\alpha_2 = 2$), 
and the dashed-dotted curves the estimates for the combined samples. The solid curves give 
the corresponding actual distribution functions for these three situations. The bandwidth for 
the computation of the MSLEs was $b_N = 2N^{-1/5} \approx 0.57708$, where $N = m + n = 500$. 
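
For readers who want to reproduce data of the kind shown in Figure 1, the following Python sketch generates current status samples from the Weibull model (1.6) with uniform observation times on $[0,2]$ and evaluates the bandwidth $2N^{-1/5}$. The function names and the use of inverse-transform sampling are our own illustrative choices, not a description of the program used for the figure.

```python
import numpy as np

def weibull_cdf(x, alpha, lam):
    """Distribution function belonging to the density alpha*lam*x^(alpha-1)*exp(-lam*x^alpha)."""
    return 1.0 - np.exp(-lam * np.power(x, alpha))

def sample_current_status(n, alpha, lam, rng, t_max=2.0):
    """Draw n pairs (T_i, Delta_i) with Delta_i = 1{X_i <= T_i}.

    X_i is Weibull (inverse-transform sampling) and T_i is Uniform(0, t_max),
    independent of X_i. Only (T_i, Delta_i) is returned, as in the model.
    """
    u = rng.uniform(size=n)
    x = (-np.log1p(-u) / lam) ** (1.0 / alpha)  # hidden variables
    t = rng.uniform(0.0, t_max, size=n)         # observation times
    delta = (x <= t).astype(int)
    return t, delta

def bandwidth(N, c=2.0):
    """Bandwidth c * N^(-1/5); c = 2 is the constant used for Figure 1."""
    return c * N ** (-0.2)

rng = np.random.default_rng(12345)
t1, d1 = sample_current_status(250, alpha=0.5, lam=1.6, rng=rng)
t2, d2 = sample_current_status(250, alpha=2.0, lam=1.6, rng=rng)
b_N = bandwidth(len(t1) + len(t2))
```

With $N = 500$ the value of `b_N` agrees with the bandwidth 0.57708 quoted in the caption of Figure 1.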

2 A likelihood ratio test, based on maximum smoothed 
likelihood estimators 

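In order to avoid problems at the boundary, we restrict the domain on which we compute 
our test statistic to an interval $[a,b] \subset (0, M)$, where $[0,M]$ is assumed to be the support of 
the underlying densities, corresponding to the distribution functions $F_1$ and $F_2$ of the hidden 
variables. We consider the statistic $V_N$, similar to (1.5), and defined by 

$$V_N = \frac{2m}{N} \int_{t \in [a,b]} \Bigl\{\tilde h_{N1}(t) \log \frac{\tilde F_{N1}(t)}{\tilde F_N(t)} + \{\tilde g_{N1}(t) - \tilde h_{N1}(t)\} \log \frac{1 - \tilde F_{N1}(t)}{1 - \tilde F_N(t)}\Bigr\}\, dt$$
$$\qquad + \frac{2n}{N} \int_{t \in [a,b]} \Bigl\{\tilde h_{N2}(t) \log \frac{\tilde F_{N2}(t)}{\tilde F_N(t)} + \{\tilde g_{N2}(t) - \tilde h_{N2}(t)\} \log \frac{1 - \tilde F_{N2}(t)}{1 - \tilde F_N(t)}\Bigr\}\, dt, \tag{2.1}$$
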
where $\tilde F_{N1}$, $\tilde F_{N2}$ and $\tilde F_N$ are the maximum smoothed likelihood estimators (MSLEs) for the 
first, second and combined sample, respectively, and $\tilde g_{Ni}$ and $\tilde h_{Ni}$ are kernel estimates of the 
relevant observation densities, defined below. As explained in Groeneboom et al. (2010), where 
the same type of MSLE for the current status model is defined, the MSLEs for the combined 
samples and the first and second sample are computed by replacing the cusum diagrams (1.2), 
(1.3) and (1.4) by the continuous cusum diagrams 

$$\bigl(\tilde G_N(t), \tilde H_N(t)\bigr), \qquad t \in [0,M], \tag{2.2}$$

$$\bigl(\tilde G_{N1}(t), \tilde H_{N1}(t)\bigr), \qquad t \in [0,M], \tag{2.3}$$

and 

$$\bigl(\tilde G_{N2}(t), \tilde H_{N2}(t)\bigr), \qquad t \in [0,M], \tag{2.4}$$

respectively, where $\tilde G_N$, $\tilde G_{Ni}$, $\tilde H_N$, $\tilde H_{Ni}$ and their derivatives are defined in the following way. 
We first define the densities $\tilde g_{N1}$ and $\tilde h_{N1}$ on $[b_N, M - b_N]$ by 

$$\tilde g_{N1}(t) = \int K_{b_N}(t-u)\, d\mathbb{G}_{N1}(u), \qquad \tilde h_{N1}(t) = \int K_{b_N}(t-u)\, \delta\, d\mathbb{P}_{N1}(u,\delta). \tag{2.5}$$

Here $\mathbb{G}_{N1}$ is the empirical distribution function of the observations $T_1, \dots, T_m$ of the first 
sample and $\mathbb{P}_{N1}$ is the empirical distribution of the observations $(T_1, \Delta_1), \dots, (T_m, \Delta_m)$ of the 
first sample, with the analogous definitions of $\mathbb{G}_{N2}$ and $\mathbb{P}_{N2}$ for the second sample. The densities 
$\tilde g_N$ and $\tilde h_N$ are defined on $[b_N, M - b_N]$ by 

$$\tilde g_N = \alpha_N \tilde g_{N1} + \beta_N \tilde g_{N2}, \qquad \tilde h_N = \alpha_N \tilde h_{N1} + \beta_N \tilde h_{N2}, \qquad \alpha_N = \frac{m}{N}, \qquad \beta_N = 1 - \alpha_N.$$

The kernel $K_b$ is defined in the usual way by 

$$K_b(u) = \frac{1}{b} K(u/b),$$

for a bandwidth $b > 0$, where $K$ is a symmetric positive polynomial-type kernel with compact 
support. In our simulation study we took 

$$K(u) = \frac{35}{32} \bigl(1 - u^2\bigr)^3 1_{[-1,1]}(u), \tag{2.6}$$

the so-called triweight kernel. 
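
A minimal Python sketch of the triweight kernel and of kernel estimates of the type (2.5) is given below. It is only meant to make the definitions concrete: the function names are ours, and the sketch covers interior points only, that is, it does not implement the boundary correction that is used near $0$ and $M$.

```python
import numpy as np

def triweight(u):
    """Triweight kernel (35/32) * (1 - u^2)^3 on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 35.0 / 32.0 * (1.0 - u ** 2) ** 3, 0.0)

def kernel_estimates(grid, t, delta, b):
    """Kernel estimates g and h on a grid, for one sample.

    g(s) = (1/k) * sum_i K_b(s - T_i),
    h(s) = (1/k) * sum_i Delta_i * K_b(s - T_i),
    where k is the sample size and K_b(u) = K(u / b) / b.
    No boundary kernel is applied here (assumption: grid lies in [b, M - b]).
    """
    grid = np.asarray(grid, dtype=float)
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    k = triweight((grid[:, None] - t[None, :]) / b) / b
    g = k.mean(axis=1)
    h = (k * delta[None, :]).mean(axis=1)
    return g, h
```

Because the kernel is positive and the indicators are 0 or 1, the estimates satisfy $0 \le h \le g$ pointwise, so that the ratio $h/g$, where $g > 0$, takes values in $[0,1]$.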

For $t \in [0, b_N]$ and $t \in [M - b_N, M]$ we use a boundary kernel, defined by a linear combination 
of $K(u)$ and $uK(u)$. Other ways of bias correction at the boundary are also possible, but it seems 
necessary to use such a correction in order to obtain a reasonable behavior at the boundary. 
Using boundary kernels, we lose the simple property that the distribution function can be 
obtained by just integrating the kernel, and indeed the estimates of the distribution functions 
were obtained by numerically integrating the estimates of the densities (and not by integrating 
the kernels). So we define 

$$\tilde G_N = \alpha_N \tilde G_{N1} + \beta_N \tilde G_{N2}, \qquad \tilde H_N = \alpha_N \tilde H_{N1} + \beta_N \tilde H_{N2},$$

and use the corresponding numerical integrals in the continuous cusum diagrams (2.2) to (2.4). 

Note that the cusum diagrams (2.2) to (2.4) are continuous analogues of the cusum diagrams 
(1.2) to (1.4), since, for example, the left-continuous slope of the greatest convex minorant of 
(1.2) is the same as the left-continuous slope of the greatest convex minorant of the cusum 
diagram consisting of the set of points 

$$\bigl\{\bigl(\mathbb{G}_N(t), \mathbb{H}_N(t)\bigr),\ t \ge 0\bigr\},$$

where $\mathbb{G}_N$ is the empirical distribution function of the points $T_i$, $i = 1, \dots, N$, and $\mathbb{H}_N$ is the 
empirical sub-distribution function of the points $T_i$, $i = 1, \dots, N$, with $\Delta_i = 1$. However, the 
slopes of the greatest convex minorants of the continuous cusum diagrams (2.2) to (2.4) are 
continuous functions of $t$, in contrast with the left-continuous slopes of the cusum diagrams (1.2) 
to (1.4). 

The following result shows that the test statistic $V_N$ is, for a suitable choice of the bandwidth, 
an asymptotic pivot under the null hypothesis of equality of the two distribution functions $F_1$ 
and $F_2$ of the hidden variables in the two samples. 

Theorem 2.1 Let the test statistic $V_N$ be defined by (2.1), using a bandwidth $b_N$ such that 
$b_N \asymp N^{-\alpha}$, where $2/9 < \alpha < 1/3$. Furthermore, let $F$ stay away from zero and one on $[a, b]$ and 
have a bounded continuous second derivative $f'$ on an interval $(a',b')$ containing $[a,b]$, and let 
$g_1$ and $g_2$ be continuous densities which stay away from zero on $[a,b]$, with continuous bounded 
second derivatives on the interval $(a',b')$. Let the log likelihood ratio statistic $V_N$, based on the 
MSLEs, be defined by (2.1). Then we have in probability, if the distribution functions of the 
hidden variables in the two samples are both equal to $F$ and $m/N \to \alpha \in (0, 1)$, as $N \to \infty$, 

(2.7) 

where $N(0,\sigma_K^2)$ denotes a normal distribution with mean zero and variance 


Remark 2.1 To say that (2.7) holds in probability means that 

for each $x \in \mathbb{R}$, where $\Phi$ is the standard normal distribution function and $\stackrel{p}{\longrightarrow}$ denotes 
convergence in probability. 

Remark 2.2 The restriction of the bandwidth to the range $N^{-1/3} \ll b_N \ll N^{-2/9}$ has the 
following motivation. The condition $b_N \gg N^{-1/3}$ is necessary for having the asymptotic 
equivalence of the MSLEs to ratios of kernel estimators (see Corollary 3.4 in Groeneboom et al. 
(2010)), and $b_N \ll N^{-2/9}$ prevents the bias from entering, which would cause the asymptotic 
distribution of $V_N$ to become dependent on the observation densities $g_1$ and $g_2$. The bias term 
drops out if the observation densities $g_1$ and $g_2$ are the same in the two samples. 

Nevertheless, we prefer to work with a larger bandwidth, at the cost of introducing a bias 
term, depending on the underlying distributions, as shown in Theorem 2.2. It turns out that 
this bias term does not bother us, if we compute the critical values by a bootstrap procedure, 
to be discussed in section 4. The key to this is that the bias term is estimated automatically 
in the bootstrap resampling from a smooth estimate of $F$ and that the difference between this 
estimate of the bias and the bias is sufficiently small, as shown in the proof of Theorem 4.1, so 
that we can replace it by the deterministic bias in the central limit theorem for the bootstrap 
test statistic. 

Theorem 2.2 Let the test statistic $V_N$ be defined by (2.1), using a bandwidth $b_N$ such that 
$b_N \asymp N^{-\alpha}$, where $1/5 < \alpha < 2/9$. Furthermore, let $F$ stay away from zero and one on $[a, b]$ and 
have a bounded continuous second derivative $f'$ on an interval $(a',b')$ containing $[a,b]$, and let 
$g_1$ and $g_2$ be continuous densities which stay away from zero on $[a,b]$, with continuous bounded 
second derivatives on the interval $(a',b')$. Let the log likelihood ratio statistic $V_N$, based on the 
MSLEs, be defined by (2.1). Then we have in probability, if the distribution functions of the 
hidden variables in the two samples are both equal to $F$ and $m/N \to \alpha \in (0, 1)$, as $N \to \infty$, 

'fJf' "'A' ^ ( / ^A' >tv I 

A., f - f (())a»(t)9i(<)92(i) \J I j 

A]V(0,<r|:), 
where $\bar g_N$ is defined by: 

$$\bar g_N(t) = \alpha_N g_1(t) + \beta_N g_2(t),$$

and $N(0,\sigma_K^2)$ denotes a normal distribution with mean zero and variance $\sigma_K^2$ defined as in 
Theorem 2.1. 

Remark 2.3 If $b_N \asymp N^{-1/5}$ the situation becomes even more complicated. If the observation 
densities $g_1$ and $g_2$ are the same, we still get the asymptotic normality result, as shown in the 
following theorem. But if the densities $g_1$ and $g_2$ are different, extra non-negligible random 
terms enter because of the presence of the bias term. We will not discuss this further in the 
present paper. 

Theorem 2.3 Let the test statistic $V_N$ be defined by (2.1), using a bandwidth $b_N$ such that 
$b_N \asymp N^{-\alpha}$, where $1/5 \le \alpha < 1/3$. Furthermore, let $F$ stay away from zero and one on $[a,b]$ 
and have a bounded continuous second derivative $f'$ on an interval $(a',b')$ containing $[a,b]$, 
and let $g_1 = g_2$ be a continuous density which stays away from zero on $[a,b]$, with a continuous 
bounded second derivative on the interval $(a', b')$. Then we have in probability, if the distribution 
functions of the hidden variables in the two samples are both equal to $F$ and $m/N \to \alpha \in (0,1)$, 
as $N \to \infty$, 

^{v^a { 1^^' • • • ' - ]^ / ^ 

where $N(0,\sigma_K^2)$ denotes a normal distribution with mean zero and variance $\sigma_K^2$ defined as in 
Theorem 2.1. 

Remark 2.4 We used a conditional formulation, since we will use conditional tests in our 
bootstrap approach, but the convergence in distribution will also hold in Theorems 2.1 to 2.3, 
if we do not condition on $T_1, \dots, T_N$. 



3 The original LR test 

We return to the original LR test, using the MLEs, and confine ourselves to a heuristic discus- 
sion, since a complete treatment is still out of our grasp. As in the proof of Theorem 2.1, we 
have: 



/ (F.r(t)log%^ + {l-F;vrW}logi^4^| d^^t) 

J\a.b] I FnU) 1 - FNit) I 

dGi{t), 



a.b] [ FN{t) l-FN{t) 

{FN{t)-FNi{t)y 

F{t){l-Fit)} 

with a similar relation for the terms involving $\hat F_{N2}$. This motivates the study of integrals of the 
following type: 

$$\int_a^b \frac{\{\hat F_N(x) - F(x)\}^2}{F(x)\{1 - F(x)\}}\, dG(x).$$


The local limit of the MLE of the combined samples under the null hypothesis, when the 
distribution of the observation times $T_i$ in both samples is given by $G$, is given in the following 
theorem, which can be found on p. 89 of Groeneboom and Wellner (1992). 

Theorem 3.1 Let $t_0$ be such that $0 < F(t_0), G(t_0) < 1$, and let $F$ and $G$ be differentiable at 
$t_0$, with strictly positive derivatives $f(t_0)$ and $g(t_0)$, respectively. Furthermore, let $\hat F_N$ be the 
MLE of $F$ under the null hypothesis. Then we have, as $N \to \infty$, 

$$N^{1/3} \{\hat F_N(t_0) - F(t_0)\} \big/ \bigl\{\tfrac{1}{2} F(t_0)(1 - F(t_0)) f(t_0)/g(t_0)\bigr\}^{1/3} \stackrel{\mathcal{D}}{\longrightarrow} 2Z, \tag{3.1}$$

where $\stackrel{\mathcal{D}}{\longrightarrow}$ denotes convergence in distribution, and where $Z$ is the last time where standard 
two-sided Brownian motion plus the parabola $y(t) = t^2$ reaches its minimum. 

From this one can deduce, under the assumptions of Theorem 2.1, 

$$N\, E \int_a^b \frac{\{\hat F_N(x) - F(x)\}^2}{F(x)\{1 - F(x)\}}\, dG(x) \sim N^{1/3}\, 4EZ^2 \int_a^b \frac{\{f(x)^2 g(x)\}^{1/3}}{\{4F(x)(1 - F(x))\}^{1/3}}\, dx, \qquad N \to \infty, \tag{3.2}$$

where $Z$ is defined as in Theorem 3.1. By Table 4 in Groeneboom and Wellner (1992) we have: 

$$4EZ^2 \approx 1.05423856.$$

Let $K_N$ be the number of jumps of the MLE on the interval $[a,b]$. Then it follows from 
Groeneboom (2011) that, again under the assumptions of Theorem 2.1, 

$$EK_N \sim c N^{1/3} \int_a^b \frac{\{f(x)^2 g(x)\}^{1/3}}{\{4F(x)(1 - F(x))\}^{1/3}}\, dx, \qquad N \to \infty, \tag{3.3}$$

for a constant $c > 0$ which is close to 2.1, so we find 

$$\frac{4EZ^2}{c} \approx 0.5.$$

It is tempting to believe that this ratio is exactly equal to 1/2, but we have no proof of that. 
It can also be deduced from Groeneboom (2011) that $K_N$ is asymptotically normal and that, 
in fact, 

(3.4) 

for a universal constant $c_2 > 0$, not depending on the underlying distributions. 

The intuitive interpretation of all this is that we have histograms with a random number of 
cells, where, under the null hypothesis $H_0$, the number of cells has an asymptotic expectation 
which is proportional to the asymptotic expectation on the right-hand side of (3.2). Note that 

VKn <{ — } = VEKm <{ — — } + Op(l), 

and that 



r---—(2TN 4EZ^\ r---—( 2Tn \ AEZ^ EKn-Kn^ 
where $c$ is as in (3.3). Since 

where $c_2$ is defined as in (3.4), it is clear that $\sqrt{K_N}\,\{2T_N/K_N - 1\}$ is an asymptotic pivot under 
$H_0$ if and only if $\sqrt{EK_N}\,\{2T_N/EK_N - 1\}$ is an asymptotic pivot under $H_0$. 

So the situation is somewhat similar to the situation in section 2, but on the other hand 
much more complicated: the MLEs are in fact histogram-type estimators, where the number 
of cells of the histograms is random, and the estimators $\hat F_{N1}$, $\hat F_{N2}$ and $\hat F_N$ are nonlinear 
estimators which are also asymptotically nonlinear. This leads to non-standard limit 
distributions of the pointwise estimators $\hat F_{N1}(t)$ and $\hat F_N(t)$, in contrast with the MSLEs 
$\tilde F_{N1}(t)$ and $\tilde F_N(t)$, which have normal limit distributions. Another complication is that 
$\hat F_N$, $\hat F_{N1}$ and $\hat F_{N2}$ have jumps at different locations. 

Nevertheless we want to include this original LR test in our comparisons and we use the 
bootstrap method of section 4 for generating critical values for this test. 



4 A bootstrap method for determining the critical value 

We propose the following method for determining the critical value for testing the null 
hypothesis that the two samples come from the same distribution for the likelihood ratio test discussed 
in section 2. 

First compute an MSLE $\tilde F_{N,\tilde b_N}$ for the combined sample as discussed in section 2, using a 
bandwidth $\tilde b_N \asymp N^{-1/5}$. Then, using the observations $T_1, \dots, T_m$ and $T_{m+1}, \dots, T_N$ of the two 
samples, generate corresponding bootstrap values $\Delta_1^*, \dots, \Delta_m^*$ and $\Delta_{m+1}^*, \dots, \Delta_N^*$ by letting 
the $\Delta_i^*$ be independent Bernoulli($\tilde F_{N,\tilde b_N}(T_i)$) random variables. So in practice we generate 
pseudo-random independent Uniform(0, 1) variables $U_i^*$ by using a random number generator, 
and let $\Delta_i^*$ be equal to 1 if $U_i^* < \tilde F_{N,\tilde b_N}(T_i)$ and zero otherwise. If the observation distributions, 
generating $T_1, \dots, T_m$ and $T_{m+1}, \dots, T_N$, respectively, are different, this structure is preserved 
in this procedure; in the computation of the MSLEs $\tilde F_{Nj}^*$ in the bootstrap samples the estimates 
$\tilde g_{Nj}$ of $g_j$ in the original samples are used, for $j = 1, 2$. Repeating this procedure $B$ times, we 
obtain $B$ bootstrap values $V_{N,i}^*$, $1 \le i \le B$, of the test statistic. The distribution of $V_N$ under 
the null hypothesis is now approximated by the empirical distribution of these bootstrap values 
and the critical value at (for example) level 5% by the 95th percentile of this set of bootstrap 
values $V_{N,i}^*$. 
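
The resampling scheme just described can be sketched as follows in Python. The sketch is generic and rests on our own assumptions: `f_smooth` stands for any vectorized smooth estimate of $F$ based on the combined sample (for instance an MSLE), and `statistic` for any test statistic with the hypothetical signature `statistic(t1, d1, t2, d2)`; neither is supplied here, and the function name `bootstrap_critical_value` is ours.

```python
import numpy as np

def bootstrap_critical_value(t1, t2, f_smooth, statistic, B=1000, level=0.05, rng=None):
    """Bernoulli bootstrap of the indicators, keeping the observation times fixed.

    For each of the B replications, Delta_i^* = 1 if U_i^* < f_smooth(T_i) and 0
    otherwise, separately for the two samples, so that a difference between the
    observation distributions of the samples is preserved.
    Returns the critical value and the sorted bootstrap values.
    """
    rng = np.random.default_rng() if rng is None else rng
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    p1 = np.clip(f_smooth(t1), 0.0, 1.0)
    p2 = np.clip(f_smooth(t2), 0.0, 1.0)
    values = np.empty(B)
    for k in range(B):
        d1 = (rng.uniform(size=t1.size) < p1).astype(int)
        d2 = (rng.uniform(size=t2.size) < p2).astype(int)
        values[k] = statistic(t1, d1, t2, d2)
    values.sort()
    # With B = 1000 and level = 0.05 this picks the 950th order statistic.
    idx = max(int(round((1.0 - level) * B)) - 1, 0)
    return values[idx], values
```

The null hypothesis would then be rejected if the value of the statistic in the original sample exceeds the returned critical value.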

In justifying this method for our test statistic $V_N$, we use the following theorem. 

Theorem 4.1 Let, under either of the conditions of Theorems 2.1 to 2.3, $\tilde F_{N,\tilde b_N}$ be the MSLE 
of $F$ under the null hypothesis, defined by the slope of the cusum diagram (2.2), where the 
bandwidth $\tilde b_N$ satisfies $\tilde b_N \asymp N^{-1/5}$. Let $V_N^*$ be defined by 

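$$V_N^* = \frac{2m}{N} \int_{t \in [a,b]} \Bigl\{\tilde h_{N1}^*(t) \log \frac{\tilde F_{N1}^*(t)}{\tilde F_N^*(t)} + \{\tilde g_{N1}(t) - \tilde h_{N1}^*(t)\} \log \frac{1 - \tilde F_{N1}^*(t)}{1 - \tilde F_N^*(t)}\Bigr\}\, dt$$
$$\qquad + \frac{2n}{N} \int_{t \in [a,b]} \Bigl\{\tilde h_{N2}^*(t) \log \frac{\tilde F_{N2}^*(t)}{\tilde F_N^*(t)} + \{\tilde g_{N2}(t) - \tilde h_{N2}^*(t)\} \log \frac{1 - \tilde F_{N2}^*(t)}{1 - \tilde F_N^*(t)}\Bigr\}\, dt, \tag{4.1}$$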

where $\tilde F_N^*$, $\tilde F_{N1}^*$ and $\tilde F_{N2}^*$ are the MSLEs, computed for the samples $(T_1, \Delta_1^*), \dots, (T_m, \Delta_m^*)$ and 
$(T_{m+1}, \Delta_{m+1}^*), \dots, (T_N, \Delta_N^*)$, and where the $\Delta_i^*$ are Bernoulli($\tilde F_{N,\tilde b_N}(T_i)$) random variables, 
generated in the way described before the statement of this theorem; $\tilde g_{Ni}$ and $\tilde h_{Ni}^*$ are kernel 
estimates of the relevant observation densities, just as in section 2, where 

$$\tilde h_{N1}^*(t) = m^{-1} \sum_{i=1}^{m} \Delta_i^* K_{b_N}(t - T_i), \qquad \tilde h_{N2}^*(t) = n^{-1} \sum_{i=m+1}^{N} \Delta_i^* K_{b_N}(t - T_i),$$

with the same bandwidth $b_N$ as taken in the original samples, and where the densities $\tilde g_{N1}$ and 
$\tilde g_{N2}$ are the same as in the original samples. 

Then we get under $H_0$ that the conditional distribution function of $V_N^*$, given $(T_1, \Delta_1), \dots, 
(T_N, \Delta_N)$, rescaled in the same way as in Theorems 2.1 to 2.3 (depending on the choice of 
bandwidth and presence or absence of the condition $g_1 = g_2$), converges at each $x \in \mathbb{R}$ in 
probability to the standard normal distribution function $\Phi(x)$. 

The proof of this result is given in the appendix. If the null hypothesis does not hold, we 
follow the same scheme. The critical value is again determined by first computing the $\Delta_i^*$, using 
the MSLE $\tilde F_{N,\tilde b_N}$, based on the combined sample. 

For the MLEs of section 3 we follow a similar procedure, although we presently cannot 
justify this with a result analogous to Theorem 4.1. However, the $\Delta_i^*$'s are computed by using 
the MSLE $\tilde F_{N,\tilde b_N}$, based on the original combined sample, using a bandwidth $\tilde b_N \asymp N^{-1/5}$, 
instead of the ordinary MLE for this sample. This seems to work better for the sample sizes we 
used in the simulations. For these distributions, the MSLE converges at the local rate $N^{-2/5}$, 
whereas the MLE itself has local rate $N^{-1/3}$, and this led to a better estimate of the level 
under the null hypothesis, which was taken to be 0.05. Bootstrap estimates based on the MLE 
instead of the MSLE, which we also computed, exhibited a very anti-conservative behavior for 
certain combinations of the parameters, sometimes leading to estimates of the levels which were 
twice the intended level. 

5 Other nonparametric tests 

Most tests which have been proposed for this problem are based on a comparison of simple 
functionals of the $\Delta_i$. Under the assumption that the observation times $T_i$ have the same 
distribution in the two samples, the following test statistic is proposed in Sun (2006): 

$$U_{cw} = \sum_{i=1}^{N} (Z_i - \bar Z) \Delta_i = \beta_N \sum_{i=1}^{m} \Delta_i - \alpha_N \sum_{i=m+1}^{N} \Delta_i, \tag{5.1}$$

where we take $Z_i = 1$ if the observation belongs to the first sample and $Z_i = 0$ if the observation 
belongs to the second sample in the notation of Sun (2006), p. 76, and where $\bar Z = \sum_{i=1}^{N} Z_i/N$. 

It is stated in Sun (2006) that the variance of $N^{-1/2}$ times (5.1) is given by the random 
variable 

$$N^{-1} \Bigl\{\sum_{i=1}^{m} \beta_N^2 \Delta_i + \sum_{i=m+1}^{N} \alpha_N^2 \Delta_i\Bigr\}. \tag{5.2}$$

Apart from the fact that the variance then is a random variable, we have more difficulties in 
interpreting this, since we get, if $\alpha_N \to \alpha \in (0, 1)$ and $\beta_N \to \beta = 1 - \alpha$, 

$$N^{-1} \Bigl\{\sum_{i=1}^{m} \beta_N^2 \Delta_i + \sum_{i=m+1}^{N} \alpha_N^2 \Delta_i\Bigr\} \longrightarrow \alpha\beta \Bigl\{\beta \int F(t)\, dG_1(t) + \alpha \int F(t)\, dG_2(t)\Bigr\} = \alpha\beta \int F(t)\, dG(t),$$

if $G_1 = G_2 = G$. But the actual variance of $N^{-1/2}$ times (5.1) is given by: 

$$\alpha_N \beta_N \int F(t)\, dG(t) \Bigl\{1 - \int F(t)\, dG(t)\Bigr\}, \tag{5.3}$$

if $G_1 = G_2 = G$. So the proposed estimate of the variance in Sun (2006) will severely 
overestimate the actual variance, and the proposed normalization will not give a standard normal 
distribution in the limit, as claimed in Sun (2006). 

Also, considering the $Z_i$ as i.i.d. random variables, as in Sun (2006), where $Z_i$ is a Bernoulli 
random variable with 

$$P\{Z_i = 1\} = \alpha_N, \qquad P\{Z_i = 0\} = \beta_N = 1 - \alpha_N,$$

and where the $Z_i$ are independent of the observation times $T_i$ and the indicators $\Delta_i$, we arrive 
at (5.3) instead of (5.2) as the approximate variance of $N^{-1/2}$ times (5.1). This is seen in the 
following way. 

We can write, under the null hypothesis that the $\Delta_i$ have the same distribution, and also 
under the restriction that the observations $T_i$ have the same distribution in the two samples, 

$$U_{cw} = \sum_{i=1}^{N} (Z_i - \bar Z) \Delta_i = \sum_{i=1}^{N} (Z_i - \bar Z)(\Delta_i - E\Delta_i)$$
$$\qquad = \sum_{i=1}^{N} (Z_i - \alpha_N)(\Delta_i - E\Delta_i) - (\bar Z - \alpha_N) \sum_{i=1}^{N} (\Delta_i - E\Delta_i), \tag{5.4}$$

using $E\Delta_i = E\Delta_1$ for each $i$. This yields: 

$$N^{-1}\, \mathrm{Var}(U_{cw}) \sim \mathrm{var}\bigl((Z_1 - \alpha_N)(\Delta_1 - E\Delta_1)\bigr) = \alpha_N \beta_N \int F(t)\, dG(t) \Bigl\{1 - \int F(t)\, dG(t)\Bigr\},$$

since the second expression on the right-hand side of (5.4) gives a contribution of lower order. 
So we arrive (not surprisingly) again at (5.3) as an approximation of the variance of $N^{-1/2} U_{cw}$ 
in the interpretation of the $Z_i$ as i.i.d. random variables, implying that the $\hat\sigma_{cw}$ suggested as 
standardization of the statistic $N^{-1/2} U_{cw}$ in Sun (2006) in the last line of the first paragraph of 
section 4.2.1.1, has to be replaced by an estimate of the square root of (5.3), also if we consider 
the $Z_i$ to be random. The mistake of taking (5.2) as an estimate of the variance is probably 
caused by ignoring the dependence of the terms $(Z_i - \bar Z)\Delta_i$, caused by $\bar Z$, and treating $\bar Z$ as if 
it were $EZ_i$. The presence of $\bar Z$ actually has a variance diminishing effect. 
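
The discrepancy between (5.2) and (5.3) is easy to check numerically. The following Python sketch is a small Monte Carlo experiment under the simplifying assumption that $G_1 = G_2 = G$ and $F_1 = F_2 = F$, so that the $\Delta_i$ are i.i.d. Bernoulli($p$) with $p = \int F\, dG$; the function name and the chosen parameter values are ours and purely illustrative.

```python
import numpy as np

def simulate_variances(m, n, p, reps, rng):
    """Compare the variance of N^{-1/2} times (5.1) with (5.2) and (5.3).

    Assumption: under H0 with equal observation distributions the indicators
    are i.i.d. Bernoulli(p), p = int F dG, so only their sums per sample matter.
    Returns (empirical variance, Monte Carlo mean of (5.2), value of (5.3)).
    """
    N = m + n
    a, b = m / N, n / N                     # alpha_N and beta_N
    s1 = rng.binomial(m, p, size=reps)      # sum of Delta_i in the first sample
    s2 = rng.binomial(n, p, size=reps)      # sum of Delta_i in the second sample
    u = (b * s1 - a * s2) / np.sqrt(N)      # N^{-1/2} times (5.1)
    est_52 = (b ** 2 * s1 + a ** 2 * s2) / N
    return float(u.var()), float(est_52.mean()), a * b * p * (1.0 - p)

rng = np.random.default_rng(2024)
emp_var, mean_52, var_53 = simulate_variances(100, 100, 0.5, 20000, rng)
```

In this setting (5.2) concentrates around $\alpha_N \beta_N p$, whereas the empirical variance is close to $\alpha_N \beta_N p(1-p)$, so (5.2) is too large by a factor of roughly $1/(1-p)$.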

Putting these difficulties aside, and not using the standardization by the square root of 
(5.2), we could of course consider the test statistic 

$$U_N = N^{-1/2} \Bigl\{\beta_N \sum_{i=1}^{m} \Delta_i - \alpha_N \sum_{i=m+1}^{N} \Delta_i\Bigr\}, \tag{5.5}$$

which has expectation zero under the null hypothesis, provided $G_1 = G_2$, and variance (5.3), if 
$G_1 = G_2 = G$. Then, since the MLE $\hat F_N$, based on the combined samples, satisfies, under some 
regularity conditions, 

$$\int \hat F_N(t)\, d\mathbb{G}_N(t) \longrightarrow \int F(t)\, dG(t),$$

where $F$ is the limit (mixture) distribution of the combined samples (which is the underlying 
distribution under $H_0$), we could use as test statistic 

$$\frac{U_N}{\sigma_N}, \tag{5.6}$$

where $U_N$ is defined by (5.5), and where 

$$\sigma_N^2 = \alpha_N \beta_N \int \hat F_N(t)\, d\mathbb{G}_N(t) \Bigl\{1 - \int \hat F_N(t)\, d\mathbb{G}_N(t)\Bigr\}. \tag{5.7}$$

Then (5.6) tends to a standard normal distribution under the null hypothesis, if $G_1 = G_2 = G$. 
We note that in Sun (2006) also a test where $G_1 \ne G_2$ is allowed is discussed, but since this 
test is connected to a specific parametric model, it is not a test of the fully nonparametric type 
we consider here. 
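
A standardized statistic of this kind is simple to compute, as the following Python sketch indicates. It relies on a standard property of isotonic regression which we use here as an assumption about the implementation: the fitted values of the MLE at the observation times have the same mean as the indicators, so that $\int \hat F_N\, d\mathbb{G}_N$ equals the fraction of indicators equal to 1 in the combined sample. The function name `standardized_indicator_test` and the default critical value 1.96 (two-sided level 5%) are our own choices.

```python
import numpy as np

def standardized_indicator_test(d1, d2, z_crit=1.96):
    """Statistic (5.5) divided by the square root of (5.7), with a two-sided test.

    d1, d2: arrays of 0/1 indicators for the first and second sample.
    Returns (statistic, reject). If all indicators are equal, the variance
    estimate is zero and the function returns (0.0, False).
    """
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    m, n = d1.size, d2.size
    N = m + n
    a, b = m / N, n / N                                # alpha_N and beta_N
    u = (b * d1.sum() - a * d2.sum()) / np.sqrt(N)     # (5.5)
    # Mean of the indicators; equals the integral of the MLE w.r.t. G_N.
    p_hat = (d1.sum() + d2.sum()) / N
    sigma2 = a * b * p_hat * (1.0 - p_hat)             # (5.7)
    if sigma2 <= 0.0:
        return 0.0, False
    stat = float(u / np.sqrt(sigma2))
    return stat, abs(stat) > z_crit
```

Note that the statistic depends on the data only through the numbers of indicators equal to 1 in the two samples, which makes explicit that tests of this type compare simple functionals of the $\Delta_i$.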

Andersen and Rønn (1995) consider a test based on 



Wn = 



on an interval $[0, a]$, where $W_N$ is asymptotically standard normal under the null hypothesis, 
if $G_1 = G_2$ (note that in their definition of this test statistic, which is denoted by $W$ on p. 
325, a factor $\sqrt{n}$ is missing in the numerator). They rely in their proof on the master's thesis 
Hansen (1991), which, incidentally, was written at Delft University of Technology, and not at 
the University of Copenhagen, as stated in Andersen and Rønn (1995). 
Under the conditions of Theorem 2.1 we have: 

$\longrightarrow N(0,1)$, (5.8) 

under $H_0$, where $N(0, 1)$ is the standard normal distribution. A sketch of how this result can 
be derived, roughly using the techniques developed in Hansen (1991), is given in the appendix. 



6 A simulation study 

In this section we compare the LR test based on the MSLEs and the real LR test with the 
methods discussed in the preceding section. In our comparison we use the same Weibull 
model which was used in the comparison given in Andersen and Rønn (1995). In determining 
the critical levels and the powers of the tests based on $V_N$ (the test statistic based on the 
MSLEs) and the LR test based on the MLEs, we used the method described in section 4, that 
is, the critical values were determined by (Bernoulli) bootstrapping the $\Delta_i$, using the MSLE 
$\tilde F_{N,\tilde b_N}(T_i)$ for the combined samples at the observations $T_i$, by taking 1000 bootstrap samples 
and determining the 95th percentile of the bootstrap test statistics so obtained. 

As the bandwidth for smoothing the MLE $\hat F_N$, we used $b_N = 2N^{-1/5}$ in all instances, 
and we used the kernel (2.6) in computing $\tilde F_N$, as described in section 2. As the observation 
densities $g_1$ and $g_2$ for the observation times $T_i$ we took the uniform densities on $[0, 2]$, just as 
in Andersen and Rønn (1995). Note that in the simulation study of Andersen and Rønn (1995) 
$g_1 = g_2$, so we can apply Theorem 2.3. This allowed us to resample from the MSLE $\tilde F_N$, which 
was also used in the computation of the test statistic for the original samples. 

The powers and levels computed below for the test statistic $V_N$ (MSLEs) and the LR statistic
based on the MLEs are determined by taking 1000 samples from the original distributions
and taking 1000 bootstrap samples from each sample, rejecting the null hypothesis if the value
in the original sample was larger than the 950th order statistic of the values obtained in the
bootstrap samples. The values given in the tables below represent the fraction of rejections for
the 1000 samples from the original distributions. The simulations were carried out using a C
program, which was written by the author specifically for this analysis.
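
Since each tabled value is a fraction of rejections over 1000 replications, its Monte Carlo accuracy can be gauged with the binomial standard error. The short sketch below is an illustration added for orientation, assuming independent replications; the helper name `rejection_fraction_se` is hypothetical.

```python
import math

def rejection_fraction_se(p, n_rep=1000):
    """Binomial standard error of an estimated rejection probability."""
    return math.sqrt(p * (1.0 - p) / n_rep)

# At the intended level 0.05 with 1000 replications:
se_level = rejection_fraction_se(0.05)              # about 0.0069
band = (0.05 - 2 * se_level, 0.05 + 2 * se_level)   # roughly (0.036, 0.064)
print(se_level, band)
```

Estimated levels inside this band are compatible with the intended level 0.05 at this number of replications.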

We also included the tests discussed in section 5, where Wn denotes the test statistic
of Andersen and Rønn (1995) and Un denotes the test statistic of Sun (2006), but with the
incorrect estimate of the variance (5.2) in Sun (2006) replaced by (5.7). In this case we just
took 1.96 as our critical value for the absolute value of the test statistic, since the convergence
to the standard normal distribution is reasonably fast for these test statistics under the null
hypothesis. In this way one can compute tables of this type rather quickly for these test statistics,
which was again done by writing a C program for this purpose. The tabled values are again
based on 1000 samples from the original (Weibull) distributions.

Using the same parametrization as in Andersen and Rønn (1995), we generated the first
sample from the density

$$\alpha_1\lambda x^{\alpha_1 - 1} e^{-\lambda x^{\alpha_1}}, \qquad x > 0, \tag{6.1}$$

and the second sample from the density

$$\alpha_2\lambda\theta x^{\alpha_2 - 1} e^{-\lambda\theta x^{\alpha_2}}, \qquad x > 0, \tag{6.2}$$

where λ = 1.6 or λ = 0.58, and αi = 0.5, 1.0 or 2.0. The value of θ is 1, 1.25 or 2. Why these
specific values were taken in Andersen and Rønn (1995) is not clear to me, but I take the same
values for an easy comparison with the work reported in their paper. I have to note, though,
that for αi = 0.5 the Weibull density is unbounded near zero, and that then the results of
Hansen (1991) are not valid on [0, 2], since one of the conditions in her thesis was that this
density is bounded on the interval of interest. This is also one of the reasons that the interval
[0, 2], used in Andersen and Rønn (1995), was shrunk to [0.1, 1.9] in our simulation study, since
the density is bounded on this interval.
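
Generating current status data from the Weibull model can be sketched as follows. This is a hedged illustration, assuming that the densities (6.1) and (6.2) correspond to the distribution function F(x) = 1 - exp(-λθx^α) (with θ = 1 for the first sample) and that the observation times are uniform on [0, 2]; the function names are hypothetical and this is not the C program used for the simulations.

```python
import numpy as np

def weibull_sample(size, lam, alpha, theta=1.0, rng=None):
    """Draw from F(x) = 1 - exp(-lam * theta * x**alpha) by inversion."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=size)
    # Solve u = F(x) for x; log1p(-u) is log(1 - u), finite because u < 1.
    return (-np.log1p(-u) / (lam * theta)) ** (1.0 / alpha)

def current_status_sample(size, lam, alpha, theta=1.0, rng=None):
    """Return (T, Delta) with T uniform on [0, 2] and Delta = 1{X <= T}."""
    rng = np.random.default_rng() if rng is None else rng
    x = weibull_sample(size, lam, alpha, theta, rng)   # unobserved event times
    t = rng.uniform(0.0, 2.0, size=size)               # observation times
    return t, (x <= t).astype(int)
```

Only the pairs (T, Δ) would be passed on to the tests; the event times themselves are never observed.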

To illustrate the effect of different observation distributions in the two samples, we generated
the first sample of Ti's again from the uniform density on [0, 2], but the second sample from
the decreasing density

$$g_2(t) = \tfrac14 (2 - t)^3, \qquad t \in [0, 2],$$

see Tables 2 and 4. Note that in this case Theorem 2.3 does not apply, and we would actually
have to use Theorem 2.1 or 2.2. Nevertheless, we just proceeded in the same way as for the
simulations for the situation g1 = g2, and Tables 2 and 4 show that the test based on the MSLEs,
where we take $b_N = 2N^{-1/5}$ and compute the critical values using the bootstrap procedure,
is rather insensitive to the difference of the observation distributions G1 and G2.

Table 1: Estimated levels. The estimation interval is [0.1, 1.9], and m = n = 50; g1(t) = 1/2,
g2(t) = 1/2, α1 = α2. The intended level is α = 0.05.

g1 = g2, under H0, m = n = 50

λ, α1       1.6, 0.5   1.6, 1.0   1.6, 2.0   0.58, 0.5   0.58, 1.0   0.58, 2.0
SLR test    0.041      0.058      0.045      0.049       0.049       0.059
LR test     0.045      0.051      0.041      0.052       0.046       0.055
Un          0.050      0.060      0.047      0.054       0.058       0.052
Wn          0.055      0.066      0.087      0.061       0.061       0.072



Table 2: Estimated levels. The estimation interval is [0.1, 1.9], and m = n = 50; g1(t) = 1/2,
g2(t) = (1/4)(2 - t)^3, α1 = α2. The intended level is α = 0.05.

g2(t) = (1/4)(2 - t)^3, under H0, m = n = 50

λ, α1       1.6, 0.5   1.6, 1.0   1.6, 2.0   0.58, 0.5   0.58, 1.0   0.58, 2.0
SLR test    0.049      0.051      0.045      0.049       0.049       0.059
LR test     0.051      0.055      0.049      0.044       0.050       0.056
Un          0.422      0.745      0.950      0.262       0.540       0.885
Wn          0.122      0.108      0.130      0.326       0.302       0.276



Table 3: Estimated levels. The estimation interval is [0.1, 1.9], and m = n = 250; g1(t) = 1/2,
g2(t) = 1/2, α1 = α2. The intended level is α = 0.05.

g1 = g2, under H0, m = n = 250

λ, α1       1.6, 0.5   1.6, 1.0   1.6, 2.0   0.58, 0.5   0.58, 1.0   0.58, 2.0
SLR test    0.051      0.049      0.052      0.053       0.032       0.040
LR test     0.048      0.049      0.059      0.053       0.045       0.054
Un          0.050      0.060      0.047      0.054       0.058       0.052
Wn          0.055      0.066      0.087      0.061       0.061       0.072



The results of our experiments can be summarized in the following way. The corrected
version of the test statistic discussed in Sun (2006), denoted by Un here, has almost no power
Table 4: Estimated levels. The estimation interval is [0.1, 1.9], and m = n = 250. The intended
level is α = 0.05; g1(t) = 1/2, g2(t) = (1/4)(2 - t)^3, α1 = α2.

Under H0, m = n = 250

λ, α1       1.6, 0.5   1.6, 1.0   1.6, 2.0   0.58, 0.5   0.58, 1.0   0.58, 2.0
SLR test    0.044      0.050      0.051      0.049       0.044       0.051
LR test     0.045      0.051      0.041      0.052       0.054       0.058
Un          0.970      1.000      1.000      0.840       0.996       1.000
Wn          0.181      0.135      0.102      0.513       0.491       0.410



Table 5: Powers for different shapes, if m = n = 50. The estimation interval is [0.1, 1.9].

g1 = g2, different shapes, m = n = 50

λ, α1, α2   1.6, 0.5, 1.0   1.6, 0.5, 2.0   0.58, 0.5, 2.0   0.58, 1.0, 2.0
SLR test    0.174           0.675           0.470            0.207
LR test     0.125           0.533           0.364            0.173
Un          0.061           0.069           0.045            0.053
Wn          0.062           0.110           0.179            0.146



Table 6: Powers for different shapes, if m = n = 250. The estimation interval is [0.1, 1.9].

g1 = g2, different shapes, m = n = 250

λ, α1, α2   1.6, 0.5, 1.0   1.6, 0.5, 2.0   0.58, 0.5, 2.0   0.58, 1.0, 2.0
SLR test    0.606           1.000           0.990            0.787
LR test     0.440           1.000           0.974            0.610
Un          0.076           0.132           0.062            0.076
Wn          0.088           0.112           0.583            0.406



for different shape alternatives of the type shown in Figure 1, even for sample sizes m = n = 250.
The test proposed by Andersen and Rønn (1995), denoted by Wn, has somewhat more power
here, but is clearly also not very good for this type of alternative, as already discussed in
Andersen and Rønn (1995) (they call these the "crossing alternatives", since the distribution
functions indeed cross). Both the test based on the MSLEs and the test based on the MLEs
have more power here. The test based on Wn is surprisingly powerful for the alternatives
which have the same shape but different baseline hazards, and the test based on Un also
Table 7: Powers for different baseline hazards, same shape, if m = n = 50. The estimation
interval is [0.1, 1.9]. The parameters αi are either both 0.5 or both 2 and λ = 1.6 or 0.58;
θ = 1.25, 1.5 or 2.

g1 = g2, different baseline hazards, m = n = 50

λ, αi, θ    1.6, 0.5, 1.25   1.6, 0.5, 1.5   1.6, 0.5, 2   0.58, 2, 1.25   0.58, 2, 1.5   0.58, 2, 2
SLR test    0.138            0.283           0.632         0.091           0.208          0.480
LR test     0.097            0.218           0.498         0.082           0.171          0.342
Un          0.108            0.198           0.441         0.100           0.151          0.333
Wn          0.147            0.352           1.000         0.103           0.293          0.681



Table 8: Powers for different baseline hazards, same shape, if m = n = 250. The estimation
interval is [0.1, 1.9]. The parameters αi are either both 0.5 or both 2 and λ = 1.6 or 0.58;
θ = 1.25, 1.5 or 2.

g1 = g2, different baseline hazards, m = n = 250

λ, αi, θ    1.6, 0.5, 1.25   1.6, 0.5, 1.5   1.6, 0.5, 2   0.58, 2, 1.25   0.58, 2, 1.5   0.58, 2, 2
SLR test    0.377            0.873           1.000         0.227           0.689          0.995
LR test     0.246            0.728           0.996         0.171           0.505          0.964
Un          0.324            0.721           0.971         0.200           0.495          0.921
Wn          0.473            0.912           1.000         0.337           0.835          1.000



has more power here. The other tests, based on the MSLEs and MLEs, also have reasonable
power here, in particular the test based on the MSLEs. Finally, Tables 2 and 4 show that the
observation distributions in the two samples can be different if we use the LR-type tests, in
contrast with the other tests considered here. In fact, a difference in the observation
distributions has a disastrous effect on the tests Un and Wn; Un even gives 100% rejection
under the null hypothesis for several combinations of the parameters.

As noted in the introduction, one could try to use a permutation distribution approach in
estimating the levels of the tests under the null hypothesis, also when the observation
distributions are different. This does not seem to make much sense for the tests based on Un and
Wn, but could possibly be of use for the tests based on the MSLEs and MLEs. We did some
experiments in this direction for the Weibull distributions of the simulation study, with rather
bad results for our sample sizes m = n = 50 and m = n = 250. The general finding is that
the test based on the MLEs becomes very conservative, whereas the estimates of the levels
for the tests based on the MSLEs become too variable to be of any use. In the latter case
one big difference with the bootstrap approach is that for the approach using
the permutation distribution, the densities g1 and g2 have to be estimated anew for every new
permutation of the variables (T1, Δ1), . . . , (TN, ΔN), whereas these estimates can be held fixed
in the bootstrap approach. This probably leads to a higher variability of the values of the test
statistic under the null hypothesis for the permutation approach, leading to unstable estimates
of the levels. However, when the observation distributions are the same in the two samples, the
permutation procedure seems to work fine, and then gives the same results as the bootstrap
procedure.

As a general rule one can say that the tests based on Un or Wn can only have power if
the corresponding moment functionals are different from zero. For Un this functional is given
by

$$\int_a^b \{F_1(t) - F_2(t)\}\,dG(t), \tag{6.3}$$

and for Wn it is given by

$$\int_a^b \{F_1(t)^2 - F_2(t)^2\}\,dG(t). \tag{6.4}$$

It is clear that F1 and F2 can be very different and still satisfy

$$\int_a^b \{F_1(t) - F_2(t)\}\,dG(t) = 0 \qquad\text{or}\qquad \int_a^b \{F_1(t)^2 - F_2(t)^2\}\,dG(t) = 0,$$

and in that case the tests based on Un or Wn, respectively, will have no power. The LR tests
will not suffer from this drawback, since they involve a Kullback-Leibler type distance, and are
locally (for example if one would consider contiguous alternatives) equivalent to the squared
L2-distance

$$\int_a^b \frac{\{F_1(t) - F(t)\}^2}{F(t)\{1 - F(t)\}}\,dG_1(t) + \int_a^b \frac{\{F_2(t) - F(t)\}^2}{F(t)\{1 - F(t)\}}\,dG_2(t), \tag{6.5}$$

where F is the distribution function of the combined sample. Moreover, they allow the
observation distributions to be different in the two samples, something the other tests do not
allow.

The Weibull alternatives considered in the simulation study form a family for which the
integrals corresponding to the statistics Un and Wn are different from zero under the alternatives
considered there. So for this type of alternative the tests Un and Wn can be expected to have
a power exceeding the level of the test. But if the first sample is generated from a Weibull
distribution function F1 with parameters α = 0.5 and λ = 0.7 and the second sample is
generated from a Weibull distribution function F2 with parameters α = 1.8153 and λ = 0.7, the
distribution functions are very different (see Figure 2), although we get:

$$\int_a^b \{F_1(t) - F_2(t)\}\,dt \approx -1.87\cdot 10^{-\ldots}, \qquad a = 0.1,\ b = 1.9.$$
Taking again the observation distributions G1 and G2 to be uniform on [0, 2], we get that the test based on
the MSLE has power 0.993 for this alternative, whereas the test based on Un has power 0.048
(which is lower than the level 0.05).

Figure 2: The Weibull distribution function with parameters α = 0.5 and λ = 0.7 (solid curve)
and the Weibull distribution function with parameters α = 1.8153 and λ = 0.7 (dashed).

If the first sample is generated from a Weibull distribution function F1 with parameters
α = 0.2 and λ = 0.8 and the second sample is generated from a Weibull distribution function
F2 with parameters α = 0.767 and λ = 0.8, the distribution functions are again rather different
(see Figure 3), although we get:

$$\int_a^b \{F_1(t)^2 - F_2(t)^2\}\,dt \approx 2.6\cdot 10^{-\ldots}, \qquad a = 0.1,\ b = 1.9.$$

Taking g1 = g2 = (1/2) 1_[0,2] again, the test based on the MSLE has power 0.713 for this
alternative, whereas the test based on Wn has power 0.041 (which is again lower than the
level 0.05).

The LR test based on the MLEs instead of the MSLEs has powers 0.964 and 0.515,
respectively, for these alternatives, taking the sample sizes m = n = 250 again.
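
That the two moment functionals nearly vanish for these parameter choices can be checked numerically. The sketch below is an illustration under an assumption: it takes the Weibull distribution function to be F(t) = 1 - exp(-λ t^α), and it uses a plain composite Simpson rule on [0.1, 1.9]; the helper names are hypothetical.

```python
import math

def weibull_cdf(t, alpha, lam):
    """Assumed parametrization: F(t) = 1 - exp(-lam * t**alpha)."""
    return 1.0 - math.exp(-lam * t ** alpha)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

a, b = 0.1, 1.9
# Functional of type (6.3) for the first pair of Weibull alternatives.
first_moment = simpson(
    lambda t: weibull_cdf(t, 0.5, 0.7) - weibull_cdf(t, 1.8153, 0.7), a, b)
# Functional of type (6.4) for the second pair of Weibull alternatives.
second_moment = simpson(
    lambda t: weibull_cdf(t, 0.2, 0.8) ** 2 - weibull_cdf(t, 0.767, 0.8) ** 2, a, b)
print(first_moment, second_moment)
```

Both printed values should be small compared with the visible difference between the two distribution functions, which is the point of the examples.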

Figure 3: The Weibull distribution function with parameters α = 0.2 and λ = 0.8 (solid curve)
and the Weibull distribution function with parameters α = 0.767 and λ = 0.8 (dashed).

7 Concluding remarks

In the preceding, two fully nonparametric tests for the two-sample problem for current status
data were discussed. The tests allow the observation distributions for the two samples to be
different, and will be consistent for any situation where (6.5) is different from zero and the
distributions satisfy some regularity conditions. For the test based on the maximum smoothed
likelihood estimators (MSLEs), the theory is more complete than for the test based on the
MLEs, but we suggest a bootstrap method for determining critical values for the latter test,
which seemed to work well in the simulation study we conducted.

Most tests which have been proposed for this problem rely on specific functionals, such as
(6.3) or (6.4), which can easily be zero while the distributions F1 and F2 are very different. If
these functionals are zero, the tests cannot be expected to have power against these alternatives.
The simulation study in section 6, using a Weibull model which was also used in Andersen and
Rønn (1995), further illustrates this point.

The convergence to normality in Theorems 2.1 to 2.3 cannot be expected to be very fast.
This phenomenon is well-known from the theory of integrated mean squared errors of density
estimators. However, the bootstrap procedure we propose for estimating the critical values of
the tests, discussed in section 4, seems to work well, even for sample sizes m = n = 50. So,
for practical purposes, we advise using this procedure for estimating the critical values of the
tests, instead of relying on the asymptotic normality under the null hypothesis.

We have chosen to work with conditional tests, and in this approach we only have to resample
the Δi in estimating the critical value for the tests. It is also possible to work with unconditional
tests, but in that case one also has to resample the Ti from estimates of the densities g1 and
g2 for the first and second sample, respectively. Preliminary experiments with this procedure
indicate that the resulting powers are roughly the same for the model used in the simulation
section 6, but more research is needed to evaluate the two approaches.
8 Appendix



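Lemma 8.1 Let either of the conditions of Theorems 2.1 to 2.3 be satisfied. Then

$$\sup_{t\in[a,b]} \bigl|\tilde g_{Nj}(t) - g_j(t)\bigr| = O_p\bigl(N^{-(1-\alpha)/2}\sqrt{\log N}\bigr) \tag{8.1}$$

and

$$\sup_{t\in[a,b]} \bigl|\tilde h_{Nj}(t) - F(t)g_j(t)\bigr| = O_p\bigl(N^{-(1-\alpha)/2}\sqrt{\log N}\bigr), \qquad j = 1,2, \tag{8.2}$$

implying that also:

$$\sup_{t\in[a,b]} \bigl|\tilde F_{Nj}(t) - F(t)\bigr| = O_p\bigl(N^{-(1-\alpha)/2}\sqrt{\log N}\bigr), \qquad j = 1,2.$$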

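Proof. By Corollary 3.4 in Groeneboom et al. (2010) we have, with probability tending to
one,

$$\tilde F_{N1}(t) = \frac{\tilde h_{N1}(t)}{\tilde g_{N1}(t)}, \qquad t \in [a,b], \tag{8.3}$$

that is, the MSLE is just equal to the ratio of two kernel estimators for t ∈ [a, b], with probability
tending to one. Similarly, with probability tending to one,

$$\tilde F_{N2}(t) = \frac{\tilde h_{N2}(t)}{\tilde g_{N2}(t)}, \qquad t \in [a,b], \tag{8.4}$$

and

$$\tilde F_N(t) = \frac{\alpha_N\tilde h_{N1}(t) + \beta_N\tilde h_{N2}(t)}{\alpha_N\tilde g_{N1}(t) + \beta_N\tilde g_{N2}(t)}, \qquad t \in [a,b], \quad \alpha_N = m/N,\ \beta_N = 1 - \alpha_N. \tag{8.5}$$

Hence we assume in the following that $\tilde F_{N1}$, $\tilde F_{N2}$ and $\tilde F_N$ have the representations (8.3), (8.4)
and (8.5), respectively.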
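
The ratio-of-kernel-estimators form of the MSLE on the interior interval can be illustrated with a short sketch. It is not the author's implementation: the triweight kernel is assumed here as a stand-in for the kernel (2.6), the function names are hypothetical, and the ratio is only meaningful away from the boundary, where the representation holds with probability tending to one.

```python
import numpy as np

def triweight(u):
    """Triweight kernel on [-1, 1]; assumed here as a stand-in for kernel (2.6)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, (35.0 / 32.0) * (1.0 - u ** 2) ** 3, 0.0)

def kernel_ratio_estimate(t_grid, t_obs, delta, bandwidth):
    """Ratio h(t)/g(t) of two kernel estimators, evaluated on t_grid."""
    t_grid = np.asarray(t_grid, dtype=float)
    t_obs = np.asarray(t_obs, dtype=float)
    delta = np.asarray(delta, dtype=float)
    # Kernel weights K_b(t - T_i) for every grid point and observation.
    w = triweight((t_grid[:, None] - t_obs[None, :]) / bandwidth) / bandwidth
    g = w.mean(axis=1)                        # kernel estimate of the density g
    h = (w * delta[None, :]).mean(axis=1)     # kernel estimate of F * g
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(g > 0, h / g, np.nan)
```

The ratio is a weighted average of the indicators, so it always lies in [0, 1] where it is defined, but it need not be monotone in finite samples.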

We consider the set of functions 



= < (/) : 4>[x, u \ t,h) — K 



t — u 
h 



i[o,M](a;), t e [o,6], u > 0, /i e (0,c] 



(8.6) 



where $0 < c < (1/2)\min\{a, M - b\}$, and where M is the smallest number such that
$\min\{F_1(M), F_2(M)\} = 1$. The kernels considered in this paper (see section 2) satisfy the condition (K1) of Giné and
Guillou (2002), p. 911, implying that this class is a bounded VC class of measurable functions.
Furthermore,

2 



var {(j>{Xi,Ti \t,h)) =var [K 



t-Ti 
h 



Al 



F{u){l- F{u)}K 



t — u 
h 



9i (u) du 



h F{t- hw){l - F{t - hw)}K{wfgi{t - hw) dw < cK{Of sup gi{t). 

J te[a-h,b+h] 
Letting σ and U be defined as in Corollary 2.2 of Giné and Guillou (2002), we get from (2.…)
in this corollary the following inequality, based on Talagrand (1994) and Talagrand (1996),



>cJml„g(™) 



(8.7) 



where L and C are positive constants depending on the VC characteristics of the class and
where σ, specialized to our situation, is given by

$$\sigma = K(0)\Bigl(c \sup_{t\in[a-c,\,b+c]} g_1(t)\Bigr)^{1/2}.$$

Since we take the bandwidth $b_N$ of order $b_N \asymp N^{-\alpha}$, we get from (8.7), taking c in (8.6)
also of order $O(N^{-\alpha})$,



sup 

te[a,b] 



-1 J2 {t - Ti) Ai - EKb^ {t - Ti) Ai 

i=l 

Op(iv-(i-")/2^toiiv). 



1 i^(0) 
log 



Nbn 



Similarly, we get directly from Theorem 2.3 in Giné and Guillou (2002) that

$$\sup_{t\in[a,b]} \bigl|\tilde g_{N1}(t) - E K_{b_N}(t - T_1)\bigr| = O_p\bigl(N^{-(1-\alpha)/2}\sqrt{\log N}\bigr).$$

It now follows from (8.3), which holds with probability tending to one, that also 



sup 

te[a,b] 



FNl{t) 



Op (7V-(i-«)/2v/logiv) . 



By the conditions of Theorem 2.1 we also have:

$$E K_{b_N}(t - T_1)\Delta_1 = \int K_{b_N}(t - u)F(u)g_1(u)\,du = F(t)g_1(t) + O\bigl(N^{-2\alpha}\bigr),$$

and

$$E K_{b_N}(t - T_1) = g_1(t) + O\bigl(N^{-2\alpha}\bigr),$$

uniformly for t ∈ [a, b]. Hence we obtain:

$$\sup_{t\in[a,b]} \bigl|\tilde F_{N1}(t) - F(t)\bigr| = O_p\bigl(N^{-(1-\alpha)/2}\sqrt{\log N}\bigr).$$

The other relations are proved in a similar way.

□
Lemma 8.2 Let either of the conditions of Theorems 2.1 to 2.3 be satisfied. Then

$$V_N = \alpha_N\beta_N \int_{t\in[a,b]} \frac{\{\tilde g_{N2}(t)\tilde h_{N1}(t) - \tilde g_{N1}(t)\tilde h_{N2}(t)\}^2}{F(t)\{1 - F(t)\}\,\bar g_N(t)g_1(t)g_2(t)}\,dt + O_p\bigl(N^{-3(1-\alpha)/2}(\log N)^{3/2}\bigr), \tag{8.8}$$

where

$$\bar g_N = \alpha_N g_1 + \beta_N g_2.$$

Moreover, 



JtelaM Ht){l-m}9Nit)gi{t)g2{t) NbN J 

1 



te[a,b] 

F{t){l - F{t)}gN{t)giit)g2it) NbN 

AN + BN-Cr, + DN + Op(^^^] , (8.9) 



NVb 



'N 



where 



An='^ ^ {A.-i.(T.)}{A,-i.(T,)}r 



dt, 



_ 2ajvAv V- prT^\/A P(tM [ .91 (^)-^&iv - Ti)Kb„ {t - Tj) 



m N 



it=a { 1 - -P^(i) }5iv (i)5i (<)P2 (t) U J 
Note that $D_N = 0$ if g1 = g2.



Proof. By Lemma 8.1 and an expansion of the logarithm we get: 

2 / [hMt) log ^ + {gMt) - hMt)} log \ ~ ^^^/^^^^ I rft 

ite[a,b] [ FNit) l-FN(t) J 

= -2 / |/,^,(t)log|^ + {5;vi(i)-/im(i)}logf^|^| dt 
Jte[a,b] [ FNi{t) l-i^jvi(i)J 

= / t'ml^'m'^mf + (A.-(-")/^(logA^)3/^) . 

We likewise get, with probability tending to one,

2 / (/^..(i)log|^ + {^..(i)-/.;v2(i)}log^p^| cit 

= / Tml^'r^'^mf + (A.-(-")/^(logAr)3/^) . 

Jtela.b] hN2{t){gN2[t) - hN2[t)} ^ ' 
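So we have to consider

$$\alpha_N \int_{t\in[a,b]} \frac{\tilde g_{N1}(t)^2\{\tilde F_{N1}(t) - \tilde F_N(t)\}^2}{\tilde h_{N1}(t)\{\tilde g_{N1}(t) - \tilde h_{N1}(t)\}}\,\tilde g_{N1}(t)\,dt + \beta_N \int_{t\in[a,b]} \frac{\tilde g_{N2}(t)^2\{\tilde F_{N2}(t) - \tilde F_N(t)\}^2}{\tilde h_{N2}(t)\{\tilde g_{N2}(t) - \tilde h_{N2}(t)\}}\,\tilde g_{N2}(t)\,dt$$

$$= \alpha_N\beta_N^2 \int_{t\in[a,b]} \frac{\{\tilde g_{N2}(t)\tilde h_{N1}(t) - \tilde g_{N1}(t)\tilde h_{N2}(t)\}^2\,\tilde g_{N1}(t)}{\tilde h_{N1}(t)\{\tilde g_{N1}(t) - \tilde h_{N1}(t)\}\,\tilde g_N(t)^2}\,dt + \alpha_N^2\beta_N \int_{t\in[a,b]} \frac{\{\tilde g_{N2}(t)\tilde h_{N1}(t) - \tilde g_{N1}(t)\tilde h_{N2}(t)\}^2\,\tilde g_{N2}(t)}{\tilde h_{N2}(t)\{\tilde g_{N2}(t) - \tilde h_{N2}(t)\}\,\tilde g_N(t)^2}\,dt,$$

where $\tilde g_N = \alpha_N\tilde g_{N1} + \beta_N\tilde g_{N2}$.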
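
The equality in the preceding display rests on a short piece of algebra, written out here as a sketch. It assumes the ratio representations (8.3) and (8.4), a pooled MSLE of the form $(\alpha_N\tilde h_{N1} + \beta_N\tilde h_{N2})/\tilde g_N$, and the shorthand $\tilde g_N = \alpha_N\tilde g_{N1} + \beta_N\tilde g_{N2}$, which is notation chosen for this note:

$$\tilde F_{N1} - \tilde F_N = \frac{\tilde h_{N1}}{\tilde g_{N1}} - \frac{\alpha_N\tilde h_{N1} + \beta_N\tilde h_{N2}}{\tilde g_N} = \frac{\tilde h_{N1}\bigl(\alpha_N\tilde g_{N1} + \beta_N\tilde g_{N2}\bigr) - \tilde g_{N1}\bigl(\alpha_N\tilde h_{N1} + \beta_N\tilde h_{N2}\bigr)}{\tilde g_{N1}\tilde g_N} = \frac{\beta_N\bigl(\tilde g_{N2}\tilde h_{N1} - \tilde g_{N1}\tilde h_{N2}\bigr)}{\tilde g_{N1}\tilde g_N},$$

$$\tilde F_{N2} - \tilde F_N = \frac{\tilde h_{N2}}{\tilde g_{N2}} - \frac{\alpha_N\tilde h_{N1} + \beta_N\tilde h_{N2}}{\tilde g_N} = -\frac{\alpha_N\bigl(\tilde g_{N2}\tilde h_{N1} - \tilde g_{N1}\tilde h_{N2}\bigr)}{\tilde g_{N2}\tilde g_N}.$$

Squaring these differences and multiplying by $\tilde g_{Nj}^3/[\tilde h_{Nj}\{\tilde g_{Nj} - \tilde h_{Nj}\}]$ gives the factors $\beta_N^2$ and $\alpha_N^2$, so that both integrals involve the same squared difference $\{\tilde g_{N2}\tilde h_{N1} - \tilde g_{N1}\tilde h_{N2}\}^2$.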



We have: 



PNgNljt) I aNgN2{t) 

hNi{t){gNi{t) - hNi{t)}gN{ty hN2{t){gN2{t) - hN2{t)}gN{ty 

_ 0NgNl{t)hN2{t){gN2{t) - hN2{t)} + aNgN2{t)hNi{t){gNi{t) - /ijvi(i)} 
hNi{t){gNi{t) - hNi{t)}hN2{t){gN2{t) - hN2{t)}gN{ty 

^ o, (7v-(i-«)/2v/i^) . 



Hence: 



F{t){l-F{t)}gNit)g,{t)g2{t) 

{gN2{t)hNl{t) - gNl{t)hN2{t)Y 



.r of {gN2{t)hNl[t) - gNl{t)hN2[t)} , ^ /Ar-3(l-a)/2/, AnsM 

Jte[a,b] F{t){l- F{t)}gN{t)gi{t)g2{t) " V J 

Furthermore, 

gN2{t)hNl{t) - gNl{t)hN2{t) 

N m 

i—m-\-l i—1 

m n 

i—1 2— m+1 

TV m 

= n-i ^ if(,„(t-Ti)m-i^{F(Ti)-i^(t)}i^f,„(i-ri) 

i—m-\-l i—1 

rn 'II 

-m-^Y.^bAt-Ti)n-^ ^ {F{Ti)-F{t)}Kt,^{t-Ti) 

i=l i=m-\-l 

N m 



m-^Y.^bAt-Ti)n-^ Y {\-F{Ti)}Kb^{t-Ti). 

i=l i=m+l 
We first consider the first two terms on the right-hand side:

JV TO 

z=m+l i=l 

m n 

-m-i^i^6^(i-Ti)n-i ^ {F{Ti)-F{t)}Kt,At-Ti) 

i=l i=m+l 

m 

= 9N2{t) m-^ J2{F{Ti) - F{t)}Kt,At - Ti) 

n 

-9Ni{t)n-' {F{Ti)-F{t)}Kt^{t-Ti) 

i—m-\-l 

{m n ^ 

gN2{t)m-^Y,{Ti-t}Kb^{t-Ti)-gNi{t)n-^ ^ {Ti - t}Kb^{t - Ti)[ 
i=l i=m-\-l ) 

m 

+ i5JV2(i) m-i ^ f{ei){Ti - tyK,^{t - T,) 

i=l 

n 

-55ivi(i)n-i ^ /'(0i){T,-O'i^b„(i-T;), 

i=m-t-l 

where θi is a point between t and Ti. This implies, using the fact that the variance is of order
$O(N^{-1}b_N)$,

N TO 

i— m+1 i=l 

m n 

i — 1 2— m+1 

= b%f{t) {giiMit) - g[{t)g2{t)} ! u^K{u) du 



+ ib%m {g'imit) - g'i{t)g'm { / w'^^w 



= b%f{t) {gi{t)9',{t) - g[{t)g2{t)} J u'K{u) du + O, U^J^^^\ + O {h%) , (8.10) 



uniformly for t ∈ [a, b].
We now define

$$S_N(t) = b_N^2 f(t)\{g_1(t)g_2'(t) - g_1'(t)g_2(t)\}\int u^2K(u)\,du \tag{8.11}$$
and

$$W_N(t) = \tilde g_{N2}(t)\,m^{-1}\sum_{i=1}^{m} \{\Delta_i - F(T_i)\}K_{b_N}(t - T_i) - \tilde g_{N1}(t)\,n^{-1}\sum_{i=m+1}^{N} \{\Delta_i - F(T_i)\}K_{b_N}(t - T_i). \tag{8.12}$$

Then 

E{WN{t)\T,,..., Tjv) = 0, var (VFjvW) = O (^^) , 

and hence: 



{gN2{t)hNi{t) - gNi{t)hN2{t)y = {WN{t) + + Op (^^^) + Op (6' 

We have: 

LiaM - F{t)}Mt)9i{t)92{t) - ^Hx/A/ 

since, by the central limit theorem

Jte[aM - F{t)}Mt)9i{t)92{t) H J' 

Note that this term is zero if g1 = g2.
So we get: 

„ f {gN2{t)hNi{t) - gNi{t)hN2{t)y 
""^^"^ JteiaM F{t){l-F{t)}g^{t)g,{t)g2{t) 

Jte[a,b] ^{t)i^ 



F{t)}gN{t)gi{t)g2{t) 



+ Op[^\+Op('^\+Op{b%), (8.13) 



where $D_N$ is defined as in the formulation of the lemma, and where the term $O_p\bigl(b_N^2/\sqrt N\bigr)$ is
absent if g1 = g2. Let

m 

WNi{t) = gN2{t) ^{Ai - F{Ti)}Ki,,{t - T,), 

and 

n 

WN2{t)=gNx{t)n-^ {Ai-F{Ti)}Kb^{t-Ti). 

i— m+1 
Then, by definition (8.12), $W_N = W_{N1} + W_{N2}$, and we get:

"^^"^ Jt^iaM Fim - F{t)}9N{t)gi{t)g2{t) 
= a.P. l^^^^^ — 



F{t)}gN{t)gi{t)g2{t) 



Jte[a,b 



F{t){l-F{t)}gNit)g,it)g2it) 

We now have, using Lemma 8.1 for gN2, 

r, f Wmjt)' 

""^'"^ ' laMFmi-Fit)}gN{t)g^it)g2{t) 



I 



92{t) {m-' TZi KbAt - T.) {A, - F{T,)}Y / 1 

,] Fit){l-Fit)}gM{t)giit) ^ 



^2 l^i^i l^^g,ityg^^t)F{t){l-F{t)r' 



= 1 



5iW5Jv(t)i^W{l-i^W} 

1 



^2^ E {A.-F(T.)}{A,-f(T,)}/ 



+ 0i, 



Moreover, by the central limit theorem, 



^|A, l^^g^^t)-gr,{t)F{t){l-F{t)}''' 

aNpN pv-f A p/tU=^ g2{t)Kh^{t-T,f 



Or, 



and 



=a gi{t)gNit)F{t){l-F{t)}'^* 



lib pQ 

'Ej2E{A,-F{T,)y / 

i=i 

\ L,gdt)-gNit)F{t){l-F{t)} 



dt 
We similarly get:



dt 



i=m+l 



g2{t)-gN{t)F{t){l-F{t)} 



.ir+i ^ L^g2{t)-gN{t)F{t){l-F{t)} 



and 



Hence: 



1 \ / I 



i=m+l Jt-a 



g^{t)-gN{t)F{t){l-F{t)} 



dt 



dt J K{uf du + 



NbNJt^a 9N{t) J ^ ' \N 
aNpN\^(. Tprrru'^ 92{t)Kb^(t -Ti)' 



Y,{^,-F{T,)Y 



gi{t)-gN{t)F{t){l-F{t)} 

LI Jt=a 



dt 



aNpN ^^ rA i^/^m^ gi{t)K,„{t - 



..-.92{t)-gN{t)F(t){l-F{t)} 



NbNJt=a 9N{t) J "^KNVb^J 



^71-^ / K{u)'^ du + Or,( — ^ 
NbN J ^ ' "^KNVbN 



The representation (8.9) now follows. □ 



Proofs of Theorems 2.1 to 2.3. By Lemma 8.2, we only have to study the terms on the
right-hand side of (8.9). We condition on the values $T_1, \ldots, T_N$. The first term $A_N$ can be
written
m 

where 

2a7v/3iv \-|A. z^/^MTA Z7/^.M r 92{t)KbJt-T,)KbJt-Tj) 

i<j 



= 2- 



h- ~ ^ ^^^ ' ~ ^ '^^ 9.{t)-9Nit)F[t){l-F[t)} 



Letting $\mathcal F_j$ be the σ-algebra generated by $Y_1, \ldots, Y_j$, and $\mathcal F_0$ be the trivial σ-algebra, we get:

$$E\{Y_j \mid \mathcal F_{j-1}\} = 0, \qquad j = 1, \ldots, m.$$
Furthermore we have, in probability,



■F{T,){l-F{T,)] \ ^ {A,-i^(rO} f 



2 



gr{t)-gN{t)F{t){l-F{t)} 
where the last relation holds for large j. Hence we get, in probability, 

K{v)K{v + x)dv\ dx 



mm%N Ja 9N{t) 

2^2 rb ^^l,\2 



dt { K{v)K{v + x)dv\ dx 



We use here that (for the Ti being random again):



var 



" 4(j-i)g2(T,-)^i[._,,,fe+,,](r, -) 

^ m2 -g^{T,Yg,{T,) 



1 f4g2iT,)^l[a-b,,.b+H.]iTi)\ 
= — 7 var ' ' 

TO'* 

implying 



fri rn^ Ja 9N{tY to2 y„ 5jv(i)2 -g^^^y 

By similar methods we can extend this to the indices j = m + 1, . . . , N, where



+ 



mn 



" ^^^^ ^' " ™ Mt)F{t){l-F[t)} 



which also involves the terms $B_N$ and $C_N$, and results in:

- 4^ I / ^^'('^O/^^ + x)dXdx f M' + al9.{t)Y 2a.M)9.{t) 



2(6 - a) 



K{v)K{v + x)dv> dx. 
So we find:

$$\sum_{j=1}^{N} E\Bigl\{\bigl(N\sqrt{b_N}\,Y_j\bigr)^2 \Bigm| \mathcal F_{j-1}\Bigr\} \stackrel{p}{\longrightarrow} 2(b - a)\int\Bigl\{\int K(v)K(v + x)\,dv\Bigr\}^2 dx, \qquad N \to \infty.$$







By tedious but straightforward computations, using 4th moments of the Bernoulli distribution,
one can also check that

^ f 2 ^ 

I (^^^j) hNWf>e} I 1 ^ 0, TV ^ OO. 

The result now follows from the martingale convergence theorem on p. 171 in Pollard (1984). □ 



Sketch of proof of (5.8). First consider 

{P^tf-Fitf} dG{t), 
where we assume Gi = G2 = G. Then: 

y {Prr^itf -F{tf} dG{t) = 2 {F^{t)-F{t)}F{t)dG{t)+ [F^t) - Fit)^ dG{t) 

= 2 £ {Fm{t) - F{t)} F{t) dG{t) + Op . 

Secondly, 

2£ {F„(t) - F{t)}F{t)dG{t) = 2^ {F„(t) - s} F{t) dPoi{t,6), 

where $P_{01}$ is the probability measure generating the random variables $(T_1, \Delta_1), \ldots, (T_m, \Delta_m)$.
Let $F_0$ be a piecewise constant version of F, which is constant on the same intervals as $\hat F_m$.
Then:

2 r{Frr,{t)-d}F{t)dPoi{t,d) 
Jo 

= 2 [ {F^{t)-S}Fo{t)dPoi{t,S) + 2 f {F^{t)-S}{F{t)-Fo{t)}dPoi{t,d) 
Jo Jo 

= 2 r{Fm{t)-5}Fo{t)dPoi{t,S)+2 r{Frn{t)-F{t)}{F{t)-Fo{t)}dG{t) 
Jo Jo 

= 2^ {Fm{t) - S}Fo{t)dPoi{t,S) + Op . 

But, by the characterization of the MLE Fm, we have, if T(a) is the last point of jump of Fm 
before a, 



2 / {Fm{t) - 6}Fo{t)dFNiit,S) = 0, 

J[0,T{a)) 
and hence:



2/ {Frr,{t)-6}Fo{t)dPoi{t,6) = 2 [ {Fm{t)-6}Fo{t)d{Poi-¥N,){t,6) 

Jo J[0,T{a)) 

+ 0,(m-2/3) 

= 2 / {F{t) - SjFoit) d (Poi - Pjvi) {t, S) 

J[0,T{a)) 

+ 2 f {Fm{t)-F{t)}Foit)diPoi-¥Ni){t,S)+Op(m-^/'') 

= 2 / {F{t) - 5}F{t) d (Poi - Pivi) {t, 5) + Op (m-2/3) , 
J [0,0] ^ ^ 

where the first term, multiplied by $\sqrt m$, is asymptotically normal with mean zero and variance

$$4\int_0^a F(t)^3\{1 - F(t)\}\,dG(t).$$

This implies the result, since we can write: 

^ {F„,{tf - F„{tf} dGN{t) 

= r {P^itf - F^itf} dG{t) + [ {P^itf - F^itf} d(Gjv - G) it) 
= £ {Pmitf - F{tf} dG{t) - £ {Pnitf - F{tf} dG{t) + Op (iV-2/3) , 



/o Jo 
and since $\hat F_m$ and $\hat F_n$ are based on independent samples.



□ 



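Proof of Theorem 4.1. We may assume that, for large N, $\tilde F_{N,\tilde b_N}$ has the representation

$$\tilde F_{N,\tilde b_N}(t) = \frac{\int \delta K_{\tilde b_N}(t - u)\,d\mathbb P_N(u, \delta)}{\tilde g_{N,\tilde b_N}(t)}$$

for t ∈ [a, b], where $\tilde b_N \asymp N^{-1/5}$. This gives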

/ S^^ it - u) dV^iu, S) g'^ ~^^ (t) J SK-,^ {t - u) dF^iu, 5) 



where 



and 



~9N,b. it) =j K~,Jt-u) dGN (u) , ~g'^ -^^ (t) = J K'^^ {t-u)dQN{u), 



By the assumptions on g, and using 6jv ^ n~^/^, we have 



sup 



~9N:bJt) - 9N{t) = Op (iV-2/5y1ogn) and sup ~g'^ ~^Jt) - g' {t) 

ie[a,6] ' 



= Op (AT-i/s^iogn) 
uniformly for t ∈ [a, b]. Furthermore, since

1 ^ 



t-T, 



we get: 



J SK'^^ {t -u)dFN{u,S)- J K'^^ [t - u)F{u) dG{u) 

= j {5-F{u)}^^{t-u) (JFn{u,5) + j F{u)K'^Jt-u)d{GN-G) (u) 



and hence 

sup f^- it) -fit) = Op (tV-Vs^i^) . 
It can be proved in a similar way that 

sup F^-^^it)-Fit) = Op (tV-^/^v^I^) . 
The bootstrap test statistic now has the representation 



(8.14) 



2to 



h*Niit) log + {~gMt) - h*^,it)} log 

a,b] I -f;vW 1--^JV 



te[a,b] 

2n 



1 - F*nM 
it) 



dt 



dt, 



where 

~h*NAt) = j S*Kb^it-u)dPNjiu,6*),j = 1,2, 
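and the $\Delta_i^*$ are defined by

$$\Delta_i^* = 1\bigl\{U_i^* \le \tilde F_{N,\tilde b_N}(T_i)\bigr\}, \qquad i = 1, \ldots, N,$$

for independent uniform random variables $U_1^*, \ldots, U_N^*$, independent of the random variables $(T_i, \Delta_i)$,
$i = 1, \ldots, N$, and where we may assume, as before, that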

j6*K,,{t-u)dFN,iu,5*) 

9Nj{t) 

Note that the only extra randomness is introduced by the uniform random variables $U_i^*$, and that
the bandwidth $b_N$, used here, may be smaller than the bandwidth $\tilde b_N$, used in the computation
of $\tilde F_{N,\tilde b_N}$. In fact, $b_N$ is the bandwidth which is used in the original sample and we have, by
assumption

$$b_N \asymp N^{-\alpha},$$

where 1/5 < α < 1/3, and where we allow α = 1/5 if it is assumed that g1 = g2. The densities
$\tilde g_{Nj}$ have been computed in the original sample, using this possibly smaller bandwidth $b_N$.
We now get, similarly as in Lemma 8.2,



1 2 



y„ = UnPn / , , , , , — 7T-—T dt + Op 



where 

Sn = aNQi + /Sat 52, 

and 

9N2{t)h%i{t) - gNi{t)h*N2{t) 

N m 

i=m-\-l i=l 

m n 

-m-^^K,,{t-Ti)n-' ^ A*Kk,{t-Ti) 

i=l i=m-\-l 

N m 

= n-i KbAt-Ti)m-'Y.{^N:bjTi)-F^~,Jt)}K,,{t~T,) 

i=m-\-l i=l 



i=l i=m-\-l 
N m 

i=m+l i=l 

m n 



m 

i—l i— m+1 

This is the same decomposition as used in the proof of Lemma 8.2, but with F replaced by
$\tilde F_{N,\tilde b_N}$ and $\Delta_i$ replaced by $\Delta_i^*$. Instead of $W_N$, defined by (8.12), we get:



WUt) = ~9N2{t) ^{A* - F^-,JTi)}KtAt - T,) 

i=l 

n 

-gNi{t)n-' Y {At-F^~,JT,)}K,,{t-T,). 



i=7n+l 



and 



where 



Moreover, 



{~9N2{t)h%,it) - gNlit)h*^2{t)y = + SUt)f + Op 



S*Nit)=b%fj,;,Jt){g,{t)g'^{t)-g[{t)g2{t)} / u'K{u)du. 



„ f {9N2{t)h*Ni{^) - 9Ni{t)h*^S)Y b-a [^,.2. 

oiNPN / -= — = dt — — - — / K(u) du 

i^;v,6. W{1 - F^,i^mN{t)g,{t)g,{t) Nb^J 

= A*j, + B%-C: + D^ + Op(^j^y (8.15) 
where



l<i<j<m 

rb 



L 



g^{t)K,,{t-T,)K,,{t-T,) 



Tn<i<j<N 



^-='^E E {^i-PN,lJT,)}{A*-F,^,JT,)} 



_ _g^{t)K,,{t-T,)K,,{t-T,) 

Jt=a 35 

m N 



t=a g2{t)gN{t)Fj,iJt){l - Fj,^,Jt)} 



dt, 



mn , . 

1=1 j=m+l 



dt, 



and the bias term $D_N^*$ is given by:



Note (again) that $D_N^* = 0$ if g1 = g2.

However, the distribution function $\tilde F_{N,\tilde b_N}$ does not satisfy the condition that the second
derivative is uniformly bounded on an interval (a', b') containing [a, b], which is a condition on
F in Theorems 2.1 to 2.3. But a scrutiny of the proof of Lemma 8.2 reveals that this condition
was only needed to take care of the bias term

N m 

Yl ^b«(i-ri)m-i^{F(r,)-F(f)}i^b„(f-ri) 

i=m-\-l 1=1 

m n 

-m-i^Kb„(i-Ti)n-i {PiTi)-F{t)}Kb^{t-Ti), 

i=l i=m+l 

see (8.10), which in the present case transforms into 

N m 

Y K'>At-Ti)m-'Y{^N:bjTi)-F^-,Jt)}K,,{t-Ti) 

i—m-\-l i—1 

m n 

-m-i^i^,,(t-T,)n-i Yl {PN,hjTi)-F^-,Jt)}K,,{t-Ti), 

1=1 i=m+l 

since we do not change the Ti of the original samples. But since 

I SKI it - u) dF^iu, 5) = j^Y ^" (^) = Op (V^) , 
uniformly in t ∈ [a, b], using again the methods of Lemma 8.1 together with the assumption
that F, g1 and g2 are twice continuously differentiable, the remainder term $O(b_N^3)$ in (8.10) can
be replaced by a remainder term of order $O_p(b_N^3\log N)$, which is sufficient for our purposes.
Theorem 4.1 now follows. □